Smart Paper Notes

Deep Learning Generates Synthetic Cancer Histology for Explainability and Education

James M. Dolezal , Rachelle Wolk , Hanna M. Hieromnimon , Frederick M. Howard , Andrew Srisuwananukorn , Dmitry Karpeyev , Siddhi Ramesh , Sara Kochanny , Jung Woo Kwon , Meghana Agni

Category: Computer Vision

2022-11-12

Artificial intelligence methods including deep neural networks (DNN) can provide rapid molecular classification of tumors from routine histology with accuracy that matches or exceeds human pathologists. Discerning how neural networks make their predictions remains a significant challenge, but explainability tools help provide insights into what models have learned when corresponding histologic features are poorly defined. Here, we present a method for improving explainability of DNN models using synthetic histology generated by a conditional generative adversarial network (cGAN). We show that cGANs generate high-quality synthetic histology images that can be leveraged for explaining DNN models trained to classify molecularly-subtyped tumors, exposing histologic features associated with molecular state. Fine-tuning synthetic histology through class and layer blending illustrates nuanced morphologic differences between tumor subtypes. Finally, we demonstrate the use of synthetic histology for augmenting pathologist-in-training education, showing that these intuitive visualizations can reinforce and improve understanding of histologic manifestations of tumor biology.


Learning to Follow Instructions in Text-Based Games

Mathieu Tuli , Andrew C. Li , Pashootan Vaezipoor , Toryn Q. Klassen , Scott Sanner , Sheila A. McIlraith

Categories: Machine Learning | Artificial Intelligence | Natural Language Processing

2022-11-08

Text-based games present a unique class of sequential decision making problem in which agents interact with a partially observable, simulated environment via actions and observations conveyed through natural language. Such observations typically include instructions that, in a reinforcement learning (RL) setting, can directly or indirectly guide a player towards completing reward-worthy tasks. In this work, we study the ability of RL agents to follow such instructions. We conduct experiments that show that the performance of state-of-the-art text-based game agents is largely unaffected by the presence or absence of such instructions, and that these agents are typically unable to execute tasks to completion. To further study and address the task of instruction following, we equip RL agents with an internal structured representation of natural language instructions in the form of Linear Temporal Logic (LTL), a formal language that is increasingly used for temporally extended reward specification in RL. Our framework both supports and highlights the benefit of understanding the temporal semantics of instructions and in measuring progress towards achievement of such a temporally extended behaviour. Experiments with 500+ games in TextWorld demonstrate the superior performance of our approach.
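One way to picture "measuring progress" toward a temporally extended instruction is LTL progression, a standard technique in the LTL-for-RL literature: after each step, the formula is rewritten into what still remains to be satisfied. The following is a minimal sketch under stated assumptions, not the authors' implementation. The tuple encoding of formulas, the function names `progress` and `run`, and the proposition names (`have_carrot`, `carrot_cooked`) are all hypothetical.

```python
# Minimal LTL progression sketch (hypothetical encoding, not the paper's code).
# Formulas are nested tuples:
#   True / False                 constants
#   ("ap", name)                 atomic proposition
#   ("not", ("ap", name))        negation, applied to atomic propositions only
#   ("and", f, g), ("or", f, g)
#   ("next", f), ("eventually", f), ("until", f, g)


def _and(a, b):
    """Conjunction with constant folding."""
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return ("and", a, b)


def _or(a, b):
    """Disjunction with constant folding."""
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return ("or", a, b)


def progress(formula, true_props):
    """Rewrite `formula` given the set of propositions true at this step."""
    if formula is True or formula is False:
        return formula
    op = formula[0]
    if op == "ap":
        return formula[1] in true_props
    if op == "not":
        # Assumes the operand is an atomic proposition.
        return formula[1][1] not in true_props
    if op == "and":
        return _and(progress(formula[1], true_props),
                    progress(formula[2], true_props))
    if op == "or":
        return _or(progress(formula[1], true_props),
                   progress(formula[2], true_props))
    if op == "next":
        return formula[1]
    if op == "eventually":
        # F f  ==  f  or  X F f
        return _or(progress(formula[1], true_props), formula)
    if op == "until":
        # f U g  ==  g  or  (f and X (f U g))
        return _or(progress(formula[2], true_props),
                   _and(progress(formula[1], true_props), formula))
    raise ValueError(f"unknown operator: {op!r}")


def run(formula, trace):
    """Progress a formula through a sequence of per-step proposition sets."""
    for true_props in trace:
        formula = progress(formula, true_props)
    return formula


# "Eventually have the carrot, and after that eventually cook it."
task = ("eventually",
        ("and", ("ap", "have_carrot"),
         ("eventually", ("ap", "carrot_cooked"))))
remaining = run(task, [set(), {"have_carrot"}, set(), {"carrot_cooked"}])
```

In this sketch, a formula that progresses to `True` marks the instruction as completed and one that progresses to `False` marks it as violated; an agent could be rewarded on completion and conditioned on the remaining formula in between.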


Learn2Reg: comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning

Alessa Hering , Lasse Hansen , Tony C. W. Mok , Albert C. S. Chung , Hanna Siebert , Stephanie Häger , Annkristin Lange , Sven Kuckertz , Stefan Heldmann , Wei Shao

Category: Computer Vision

2021-12-08

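To date, few studies have comprehensively compared medical image registration approaches on a wide range of complementary, clinically relevant tasks. This limits the adoption of research advances into practice and prevents fair benchmarks across competing approaches. Many newer learning-based methods have been explored within the last five years, but the question of which optimisation, architectural or metric strategy is ideally suited remains open. Learn2Reg covers a wide range of anatomies (brain, abdomen and thorax), modalities (ultrasound, CT, MRI), populations (intra- and inter-patient) and levels of supervision. We established a lower entry barrier for training and validation of 3D registration, which helped us compile results of over 65 individual method submissions from more than 20 unique teams. Our complementary set of metrics, including robustness, accuracy, plausibility and speed, enables unique insight into the current state of the art of medical image registration. Further analyses of transferability, bias and the importance of supervision question the superiority of primarily deep-learning-based approaches and open new research directions into hybrid methods that leverage GPU-accelerated conventional optimisation.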


Tracking the Dynamics of the Tear Film Lipid Layer

Tejasvi Kothapalli , Charlie Shou , Jennifer Ding , Jiayun Wang , Andrew D. Graham , Tatyana Svitova , Stella X. Yu , Meng C. Lin

Category: Computer Vision

2022-12-07

Dry Eye Disease (DED) is one of the most common ocular diseases: over five percent of US adults suffer from DED. Tear film instability is a known factor for DED, and is thought to be regulated in large part by the thin lipid layer that covers and stabilizes the tear film. In order to aid eye related disease diagnosis, this work proposes a novel paradigm in using computer vision techniques to numerically analyze the tear film lipid layer (TFLL) spread. Eleven videos of the tear film lipid layer spread are collected with a micro-interferometer and a subset are annotated. A tracking algorithm relying on various pillar computer vision techniques is developed. Our method can be found at https://easytear-dev.github.io/.


Device Modeling Bias in ReRAM-based Neural Network Simulations

Osama Yousuf , Imtiaz Hossen , Matthew W. Daniels , Martin Lueker-Boden , Andrew Dienstfrey , Gina C. Adam

Category: Machine Learning

2022-11-29

Data-driven modeling approaches such as jump tables are promising techniques to model populations of resistive random-access memory (ReRAM) or other emerging memory devices for hardware neural network simulations. As these tables rely on data interpolation, this work explores the open questions about their fidelity in relation to the stochastic device behavior they model. We study how various jump table device models impact the attained network performance estimates, a concept we define as modeling bias. Two methods of jump table device modeling, binning and Optuna-optimized binning, are explored using synthetic data with known distributions for benchmarking purposes, as well as experimental data obtained from TiOx ReRAM devices. Results on a multi-layer perceptron trained on MNIST show that device models based on binning can behave unpredictably particularly at low number of points in the device dataset, sometimes over-promising, sometimes under-promising target network accuracy. This paper also proposes device level metrics that indicate similar trends with the modeling bias metric at the network level. The proposed approach opens the possibility for future investigations into statistical device models with better performance, as well as experimentally verified modeling bias in different in-memory computing and neural network architectures.
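To make the idea of a binned jump table concrete, here is a minimal sketch, assuming a jump table stores, for each bin of the current conductance, the empirical set of conductance changes observed from that bin. The function names `build_jump_table` and `sample_jump`, the uniform bin edges, the empty-bin rule, and the synthetic data are all illustrative assumptions, not the paper's procedure.

```python
import numpy as np


def build_jump_table(g, dg, n_bins):
    """Group observed conductance changes `dg` by the bin of their
    starting conductance `g` (uniform bins; an illustrative choice)."""
    edges = np.linspace(g.min(), g.max(), n_bins + 1)
    # Digitizing against the interior edges yields indices 0..n_bins-1.
    idx = np.digitize(g, edges[1:-1])
    table = [dg[idx == b] for b in range(n_bins)]
    return edges, table


def sample_jump(g_now, edges, table, rng):
    """Draw one conductance change for a device currently at `g_now`."""
    b = int(np.digitize(g_now, edges[1:-1]))
    if table[b].size == 0:
        return 0.0  # assumption: an empty bin produces no update
    return float(rng.choice(table[b]))


# Synthetic device data with a known distribution (assumed for illustration):
# the update shrinks as the conductance saturates, plus Gaussian noise.
rng = np.random.default_rng(0)
g = rng.uniform(0.0, 1.0, 200)
dg = 0.1 * (1.0 - g) + rng.normal(0.0, 0.01, 200)

edges, table = build_jump_table(g, dg, n_bins=8)
jump = sample_jump(0.5, edges, table, rng)
```

With few data points per bin, each bin's empirical distribution is a noisy estimate of the true one, which is one plausible source of the unpredictable behavior the abstract reports for small device datasets.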


An Empirical Study on Clustering Pretrained Embeddings: Is Deep Strictly Better?

Tyler R. Scott , Ting Liu , Michael C. Mozer , Andrew C. Gallagher

Categories: Computer Vision | Machine Learning

2022-11-09

Recent research in clustering face embeddings has found that unsupervised, shallow, heuristic-based methods -- including $k$-means and hierarchical agglomerative clustering -- underperform supervised, deep, inductive methods. While the reported improvements are indeed impressive, experiments are mostly limited to face datasets, where the clustered embeddings are highly discriminative or well-separated by class (Recall@1 above 90% and often nearing ceiling), and the experimental methodology seemingly favors the deep methods. We conduct a large-scale empirical study of 17 clustering methods across three datasets and obtain several robust findings. Notably, deep methods are surprisingly fragile for embeddings with more uncertainty, where they match or even perform worse than shallow, heuristic-based methods. When embeddings are highly discriminative, deep methods do outperform the baselines, consistent with past results, but the margin between methods is much smaller than previously reported. We believe our benchmarks broaden the scope of supervised clustering methods beyond the face domain and can serve as a foundation on which these methods could be improved. To enable reproducibility, we include all necessary details in the appendices, and plan to release the code.


Talking Head from Speech Audio using a Pre-trained Image Generator

Mohammed M. Alghamdi , He Wang , Andrew J. Bulpitt , David C. Hogg

Category: Computer Vision

2022-09-09

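We propose a novel method for generating high-resolution videos of talking heads from speech audio and a single "identity" image. Our method is based on a convolutional neural network model that incorporates a pre-trained StyleGAN generator. We model each frame as a point in the latent space of StyleGAN, so that a video corresponds to a trajectory through the latent space. The network is trained in two stages. The first stage models trajectories in the latent space conditioned on speech utterances. To do this, we use an existing encoder to invert the generator, mapping each video frame into the latent space. We train a recurrent neural network to map from speech utterances to displacements in the latent space of the image generator. These displacements are relative to the back-projection into the latent space of an identity image chosen from the individuals depicted in the training dataset. In the second stage, we improve the visual quality of the generated videos by tuning the image generator on a single image or a short video of any chosen identity. We evaluate our model on standard measures (PSNR, SSIM, FID and LMD) and show that it significantly outperforms recent state-of-the-art methods on one of two commonly used datasets and gives comparable performance on the other. Finally, we report ablation experiments that validate the components of the model. The code and videos from the experiments can be found at https://mohammedalghamdi.github.io/talking-heads-acm-mm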


Concept Gradient: Concept-based Interpretation Without Linear Assumption

Andrew Bai , Chih-Kuan Yeh , Pradeep Ravikumar , Neil Y. C. Lin , Cho-Jui Hsieh

Category: Machine Learning

2022-08-31

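Concept-based interpretations of black-box models are often more intuitive for humans to understand. The most widely adopted approach to concept-based interpretation is the Concept Activation Vector (CAV). CAV relies on learning a linear relation between some latent representation of a given model and the concepts. Linear separability is usually implicitly assumed but does not hold in general. In this work, we start from the original intent of concept-based interpretation and propose Concept Gradient (CG), which extends concept-based interpretation beyond linear concept functions. We show that for a general (potentially non-linear) concept, we can mathematically evaluate how a small change in the concept affects the model's prediction, which leads to an extension of gradient-based interpretation to the concept space. We demonstrate empirically that CG outperforms CAV on both toy examples and real-world datasets.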
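The idea of measuring how a small change in a concept affects the prediction can be sketched with the chain rule. The notation below is an assumption made for illustration (a scalar model output $f$, a single differentiable concept function $g$, and a local factorization $f = h \circ g$); the paper's exact definition may differ.

```latex
% Assumed setup: f : R^d -> R (model), g : R^d -> R (concept),
% and locally f(x) = h(g(x)) for some differentiable h.
\begin{align}
  \nabla f(x) &= h'\bigl(g(x)\bigr)\,\nabla g(x)
    && \text{(chain rule)} \\
  \frac{\partial f}{\partial g}(x)
    &\approx \frac{\nabla g(x)^{\top}\,\nabla f(x)}{\lVert \nabla g(x) \rVert^{2}}
    && \text{(least-squares solution for } h'\text{)}
\end{align}
% Linear special case: if g(x) = v^T x + b, then \nabla g(x) = v and the
% expression becomes v^T \nabla f(x) / ||v||^2, a normalized directional
% derivative of f along the fixed direction v.
```

In the linear special case the concept direction is the same at every input, which is the kind of assumption a CAV makes; for a non-linear $g$, the direction $\nabla g(x)$ varies with $x$.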



A Semi-automatic Cell Tracking Process Towards Completing the 4D Atlas of C. elegans Development

Andrew Lauziere , Ryan Christensen , Hari Shroff

Category: Computer Vision

2022-07-27

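The nematode Caenorhabditis elegans (C. elegans) is used as a model organism to better understand developmental biology and neurobiology. C. elegans has an invariant cell lineage, which has been catalogued and observed using fluorescence microscopy images. However, established methods for tracking cells in late-stage development fail to generalize once sporadic muscular twitching has begun. We build on a methodology that uses skin cells as fiducial markers to carry out cell tracking despite random twitching. In particular, we present a cell nucleus segmentation and tracking procedure, integrated into a 3D rendering GUI, to improve the efficiency of tracking cells across late-stage development. Results on images depicting the aforementioned muscle cell nuclei across three test embryos suggest that the fiducial markers, in conjunction with a classic tracking paradigm, overcome sporadic twitching.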


Language models show human-like content effects on reasoning

Ishita Dasgupta , Andrew K. Lampinen , Stephanie C. Y. Chan , Antonia Creswell , Dharshan Kumaran , James L. McClelland , Felix Hill

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-07-14

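Abstract reasoning is a key ability for an intelligent system. Large language models achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human abstract reasoning is also imperfect, and depends on our knowledge and beliefs about the content of the reasoning problem. For example, humans reason much more reliably about logical rules that are grounded in everyday situations than about arbitrary rules concerning abstract attributes. The training experiences of language models similarly endow them with prior expectations that reflect human knowledge and beliefs. We therefore hypothesized that language models would show human-like content effects on abstract reasoning problems. We explored this hypothesis across three logical reasoning tasks: natural language inference, judging the logical validity of syllogisms, and the Wason selection task (Wason, 1968). We find that state-of-the-art large language models (with 7 or 70 billion parameters; Hoffmann et al., 2022) reflect many of the same patterns observed in humans across these tasks: like humans, models reason more effectively about believable situations than about unrealistic or abstract ones. Our findings have implications for understanding both these cognitive effects and the factors that contribute to language model performance.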
